<!DOCTYPE html>
<html lang="si">
<head>
    <meta http-equiv="content-type" content="text/html;charset=utf-8"/>
    <meta name="viewport" content="width=device-width, initial-scale=1.0"/>
    <meta name="description" content="මෙය ප්රයිමර් හි විචිත ක්රියාත්මක කිරීම/නිබන්ධනයකි: PyTorch හි දැක්ම සඳහා භාෂා ආකෘති නිර්මාණය සඳහා කාර්යක්ෂම ට්රාන්ස්ෆෝමර් සෙවීම."/>

    <meta name="twitter:card" content="summary"/>
    <meta name="twitter:image:src" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
    <meta name="twitter:title" content="ප්රාථමිකය: භාෂා ආකෘති නිර්මාණය සඳහා කාර්යක්ෂම ට්රාන්ස්ෆෝමර් සෙවීම"/>
    <meta name="twitter:description" content="මෙය ප්රයිමර් හි විචිත ක්රියාත්මක කිරීම/නිබන්ධනයකි: PyTorch හි දැක්ම සඳහා භාෂා ආකෘති නිර්මාණය සඳහා කාර්යක්ෂම ට්රාන්ස්ෆෝමර් සෙවීම."/>
    <meta name="twitter:site" content="@labmlai"/>
    <meta name="twitter:creator" content="@labmlai"/>

    <meta property="og:url" content="https://nn.labml.ai/transformers/primer_ez/index.html"/>
    <meta property="og:title" content="ප්රාථමිකය: භාෂා ආකෘති නිර්මාණය සඳහා කාර්යක්ෂම ට්රාන්ස්ෆෝමර් සෙවීම"/>
    <meta property="og:image" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
    <meta property="og:site_name" content="ප්රාථමිකය: භාෂා ආකෘති නිර්මාණය සඳහා කාර්යක්ෂම ට්රාන්ස්ෆෝමර් සෙවීම"/>
    <meta property="og:type" content="object"/>
    <meta property="og:title" content="ප්රාථමිකය: භාෂා ආකෘති නිර්මාණය සඳහා කාර්යක්ෂම ට්රාන්ස්ෆෝමර් සෙවීම"/>
    <meta property="og:description" content="මෙය ප්රයිමර් හි විචිත ක්රියාත්මක කිරීම/නිබන්ධනයකි: PyTorch හි දැක්ම සඳහා භාෂා ආකෘති නිර්මාණය සඳහා කාර්යක්ෂම ට්රාන්ස්ෆෝමර් සෙවීම."/>

    <title>ප්රාථමිකය: භාෂා ආකෘති නිර්මාණය සඳහා කාර්යක්ෂම ට්රාන්ස්ෆෝමර් සෙවීම</title>
    <link rel="shortcut icon" href="/icon.png"/>
    <link rel="stylesheet" href="../../pylit.css?v=1">
    <link rel="canonical" href="https://nn.labml.ai/transformers/primer_ez/index.html"/>
    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.13.18/dist/katex.min.css" integrity="sha384-zTROYFVGOfTw7JV7KUu8udsvW2fx4lWOsCEDqhBreBwlHI4ioVRtmIvEThzJHGET" crossorigin="anonymous">

    <!-- Global site tag (gtag.js) - Google Analytics -->
    <script async src="https://www.googletagmanager.com/gtag/js?id=G-4V3HC8HBLH"></script>
    <script>
        window.dataLayer = window.dataLayer || [];

        function gtag() {
            dataLayer.push(arguments);
        }

        gtag('js', new Date());

        gtag('config', 'G-4V3HC8HBLH');
    </script>
</head>
<body>
<div id='container'>
    <div id="background"></div>
    <div class='section'>
        <div class='docs'>
            <p>
                <a class="parent" href="/">home</a>
                <a class="parent" href="../index.html">transformers</a>
                <a class="parent" href="index.html">primer_ez</a>
            </p>
            <p>
                <a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations" target="_blank">
                    <img alt="Github"
                         src="https://img.shields.io/github/stars/labmlai/annotated_deep_learning_paper_implementations?style=social"
                         style="max-width:100%;"/></a>
                <a href="https://twitter.com/labmlai" rel="nofollow" target="_blank">
                    <img alt="Twitter"
                         src="https://img.shields.io/twitter/follow/labmlai?style=social"
                         style="max-width:100%;"/></a>
            </p>
            <p>
                <a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/tree/master/labml_nn/transformers/primer_ez/__init__.py" target="_blank">
                    View code on Github</a>
            </p>
        </div>
    </div>
    <div class='section' id='section-0'>
        <div class='docs doc-strings'>
            <div class='section-link'>
                <a href='#section-0'>#</a>
            </div>
            <h1>ප්රාථමිකය: භාෂා ආකෘති නිර්මාණය සඳහා කාර්යක්ෂම ට්රාන්ස්ෆෝමර් සෙවීම</h1>
<p>මෙය <a href="https://pytorch.org">PyTorch</a> කඩදාසි <a href="https://papers.labml.ai/paper/2109.08668">ප්රයිමර් ක්රියාත්මක කිරීම: භාෂා ආකෘති නිර්මාණය සඳහා කාර්යක්ෂම ට්රාන්ස්ෆෝමර් සෙවීම</a> . </p>
<p>කතුවරුන්ට්රාන්ස්ෆෝමර් ගෘහ නිර්මාණ ශිල්පය සඳහා පරිණාමීය සෙවීමක් කරයි. සෙවුම් ප්රයිමර් භාවිතයෙන් සොයාගත් ගෘහ නිර්මාණ ශිල්පය ඔවුන් නම් කරයි (ප්රයිම්ටිව්ස් ට්රාන්ස්ෆෝමර් සෙවූ). <strong>ප්රයිමර්EZ</strong> යනු මුල් ට්රාන්ස්ෆෝමරය හා සසඳන විට ප්රයිමර් හි වඩාත්ම ශක්තිමත් වෙනස් කිරීම් දෙක සහිත ගෘහ නිර්මාණ ශිල්පයයි. ප්රයිමර් EZ වැනිලා ට්රාන්ස්ෆෝමරය වඩා වේගයෙන් දුම්රිය. </p>
<h3>වර්ගRelu</h3>
<p>සෙවුමමගින් සොයාගත් වඩාත්ම effective ලදායී වෙනස් කිරීම වන්නේ <a href="../feed_forward.html">ස්ථාන-නැණවත් පෝෂක මොඩියුලයේ</a>RelU වෙනුවට චතුරස්රාකාර RelU භාවිතා කිරීමයි. </p>
<p><span ><span class="katex-display"><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:1.204008em;vertical-align:-0.25em;"></span><span class="mord coloredeq eqa" style=""><span class="mord mathnormal" style="margin-right:0.03588em">y</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel" style="">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mord" style=""><span class="mord" style=""><span class="mop" style=""><span style="">m</span><span style="">a</span><span style="">x</span></span><span class="mopen" style="">(</span><span class="mord mathnormal" style="">x</span><span class="mpunct" style="">,</span><span class="mspace" style="margin-right:0.16666666666666666em"></span><span class="mord" style="">0</span><span class="mclose" style="">)</span></span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.954008em;"><span style="top:-3.2029em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight" style=""><span class="mord mtight" style="">2</span></span></span></span></span></span></span></span></span></span></span></span></span></span></p>
<h3>බහු-Dconv-හිසඅවධානය (MDHA)</h3>
<p>ඊළඟeffective ලදායී වෙනස් කිරීම යනු විමසුම්, යතුරු සහ අගයන් සඳහා බහු-හිස ප්රක්ෂේපණය කිරීමෙන් පසු ගැඹුරට අනුව <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.72777em;vertical-align:-0.08333em;"></span><span class="mord">3</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:0.64444em;vertical-align:0em;"></span><span class="mord">1</span></span></span></span></span> කැටි ගැසුමකි. මෙම convolution අනුක්රමය මානයක් ඔස්සේ සහ නාලිකාව (ගැඹුර-අනුව) වේ. පැහැදිලි කිරීම සඳහා, එක් එක් හිසෙහි නාලිකා ගණන සංකෝචනය නම් එක් එක් <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.84444em;vertical-align:-0.15em;"></span><span class="mord coloredeq eqd" style=""><span class="mord" style=""><span class="mord mathnormal" style="">d</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.33610799999999996em;"><span style="top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight" style=""><span class="mord mathnormal mtight" style="margin-right:0.03148em">k</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span></span></span> නාලිකා සඳහා <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.72777em;vertical-align:-0.08333em;"></span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:0.64444em;vertical-align:0em;"></span><span class="mord">3</span></span></span></span></span> කර්නල් ඇත. <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.84444em;vertical-align:-0.15em;"></span><span class="mord coloredeq eqd" style=""><span class="mord" style=""><span class="mord mathnormal" style="">d</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.33610799999999996em;"><span style="top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight" style=""><span class="mord mathnormal mtight" style="margin-right:0.03148em">k</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span></span></span> </p>
<p>ප්රයිමර්EZ සඳහා<a href="experiment.html">අත්හදා බැලීමේ කේතය මෙන්න</a>. </p>
<p><a href="https://app.labml.ai/run/30adb7aa1ab211eca7310f80a114e8a4"><img alt="View Run" src="https://img.shields.io/badge/labml-experiment-brightgreen"></a></p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">40</span><span></span><span class="kn">import</span> <span class="nn">torch</span>
<span class="lineno">41</span><span class="kn">from</span> <span class="nn">torch</span> <span class="kn">import</span> <span class="n">nn</span>
<span class="lineno">42</span>
<span class="lineno">43</span><span class="kn">from</span> <span class="nn">labml_helpers.module</span> <span class="kn">import</span> <span class="n">Module</span>
<span class="lineno">44</span><span class="kn">from</span> <span class="nn">labml_nn.transformers</span> <span class="kn">import</span> <span class="n">MultiHeadAttention</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-1'>
        <div class='docs doc-strings'>
            <div class='section-link'>
                <a href='#section-1'>#</a>
            </div>
            <h2>වර්ගRelu සක්රිය</h2>
<p><span ><span class="katex-display"><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:1.204008em;vertical-align:-0.25em;"></span><span class="mord coloredeq eqa" style=""><span class="mord mathnormal" style="margin-right:0.03588em">y</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel" style="">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mord" style=""><span class="mord" style=""><span class="mop" style=""><span style="">m</span><span style="">a</span><span style="">x</span></span><span class="mopen" style="">(</span><span class="mord mathnormal" style="">x</span><span class="mpunct" style="">,</span><span class="mspace" style="margin-right:0.16666666666666666em"></span><span class="mord" style="">0</span><span class="mclose" style="">)</span></span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.954008em;"><span style="top:-3.2029em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight" style=""><span class="mord mtight" style="">2</span></span></span></span></span></span></span></span></span></span></span></span></span></span></p>
<p>SquaredRelU <a href="../feed_forward.html">තත්ත්වය නැණවත් feedforward මොඩියුලය</a>තුළ සක්රිය කාර්යය ලෙස භාවිතා කරයි. </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">47</span><span class="k">class</span> <span class="nc">SquaredReLU</span><span class="p">(</span><span class="n">Module</span><span class="p">):</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-2'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-2'>#</a>
            </div>
            
        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">57</span>    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="lineno">58</span>        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span>
<span class="lineno">59</span>        <span class="bp">self</span><span class="o">.</span><span class="n">relu</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">ReLU</span><span class="p">()</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-3'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-3'>#</a>
            </div>
            
        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">61</span>    <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">):</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-4'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-4'>#</a>
            </div>
            <p>Reluඅයදුම් කරන්න </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">63</span>        <span class="n">x</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">relu</span><span class="p">(</span><span class="n">x</span><span class="p">)</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-5'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-5'>#</a>
            </div>
            <p>එයවර්ග කරන්න </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">65</span>        <span class="k">return</span> <span class="n">x</span> <span class="o">*</span> <span class="n">x</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-6'>
        <div class='docs doc-strings'>
            <div class='section-link'>
                <a href='#section-6'>#</a>
            </div>
            <h2>අවකාශීයගැඹුර නැණවත් සම්මුතිය</h2>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">68</span><span class="k">class</span> <span class="nc">SpatialDepthWiseConvolution</span><span class="p">(</span><span class="n">Module</span><span class="p">):</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-7'>
        <div class='docs doc-strings'>
            <div class='section-link'>
                <a href='#section-7'>#</a>
            </div>
            <ul><li><code  class="highlight"><span></span><span class="n">d_k</span></code>
 එක් එක් හිසෙහි නාලිකා ගණන වේ</li></ul>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">73</span>    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">d_k</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">kernel_size</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">3</span><span class="p">):</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-8'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-8'>#</a>
            </div>
            
        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">77</span>        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span>
<span class="lineno">78</span>        <span class="bp">self</span><span class="o">.</span><span class="n">kernel_size</span> <span class="o">=</span> <span class="n">kernel_size</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-9'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-9'>#</a>
            </div>
            <p>අපිපයිටෝර්ච් <code  class="highlight"><span></span><span class="n">Conv1d</span></code>
 මොඩියුලය භාවිතා කරමු. එක් එක් නාලිකාව සඳහා වෙනම සංවහනයක් (විවිධ කර්නල් සමඟ) සිදු වන පරිදි නාලිකා ගණනට සමාන වන පරිදි කණ්ඩායම් ගණන අපි සකස් කරමු. අපි දෙපැත්තටම පුරවන එකතු කර පසුව නිවැරදි වඩාත්ම <code  class="highlight"><span></span><span class="n">kernel_size</span> <span class="o">-</span> <span class="mi">1</span></code>
 ප්රති </p>. ල ලබා ගනිමු

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">83</span>        <span class="bp">self</span><span class="o">.</span><span class="n">conv</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">Conv1d</span><span class="p">(</span><span class="n">in_channels</span><span class="o">=</span><span class="n">d_k</span><span class="p">,</span> <span class="n">out_channels</span><span class="o">=</span><span class="n">d_k</span><span class="p">,</span>
<span class="lineno">84</span>                              <span class="n">kernel_size</span><span class="o">=</span><span class="p">(</span><span class="n">kernel_size</span><span class="p">,),</span> <span class="n">padding</span><span class="o">=</span><span class="p">(</span><span class="n">kernel_size</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,),</span> <span class="n">groups</span><span class="o">=</span><span class="n">d_k</span><span class="p">)</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-10'>
        <div class='docs doc-strings'>
            <div class='section-link'>
                <a href='#section-10'>#</a>
            </div>
            <p> <code  class="highlight"><span></span><span class="n">x</span></code>
 හැඩය ඇත <code  class="highlight"><span></span><span class="p">[</span><span class="n">seq_len</span><span class="p">,</span> <span class="n">batch_size</span><span class="p">,</span> <span class="n">heads</span><span class="p">,</span> <span class="n">d_k</span><span class="p">]</span></code>
</p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">86</span>    <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">):</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-11'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-11'>#</a>
            </div>
            <p>හැඩයලබා ගන්න </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">92</span>        <span class="n">seq_len</span><span class="p">,</span> <span class="n">batch_size</span><span class="p">,</span> <span class="n">heads</span><span class="p">,</span> <span class="n">d_k</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">shape</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-12'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-12'>#</a>
            </div>
            <p>කිරීමටඅවසර <code  class="highlight"><span></span><span class="p">[</span><span class="n">batch_size</span><span class="p">,</span> <span class="n">heads</span><span class="p">,</span> <span class="n">d_k</span><span class="p">,</span> <span class="n">seq_len</span><span class="p">]</span></code>
 </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">94</span>        <span class="n">x</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">permute</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-13'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-13'>#</a>
            </div>
            <p>හැඩයවෙනස් කරන්න <code  class="highlight"><span></span><span class="p">[</span><span class="n">batch_size</span> <span class="o">*</span> <span class="n">heads</span><span class="p">,</span> <span class="n">d_k</span><span class="p">,</span> <span class="n">seq_len</span><span class="p">]</span></code>
 </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">96</span>        <span class="n">x</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="n">batch_size</span> <span class="o">*</span> <span class="n">heads</span><span class="p">,</span> <span class="n">d_k</span><span class="p">,</span> <span class="n">seq_len</span><span class="p">)</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-14'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-14'>#</a>
            </div>
            <p>1Dකැටි ගැසුමේ ආකෘතියේ ආදානය පිළිගනී <code  class="highlight"><span></span><span class="p">[</span><span class="n">N</span><span class="p">,</span> <span class="n">channels</span><span class="p">,</span> <span class="n">sequence</span><span class="p">]</span></code>
 </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">99</span>        <span class="n">x</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">conv</span><span class="p">(</span><span class="n">x</span><span class="p">)</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-15'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-15'>#</a>
            </div>
            <p>අපිදෙපැත්තෙන්ම ආවරණයකින් සමන්විත බැවින් නිවැරදි වඩාත්ම <code  class="highlight"><span></span><span class="n">kernel_size</span> <span class="o">-</span> <span class="mi">1</span></code>
 ප්රති results ල බෝග කරන්න </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">101</span>        <span class="n">x</span> <span class="o">=</span> <span class="n">x</span><span class="p">[:,</span> <span class="p">:,</span> <span class="p">:</span><span class="o">-</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">kernel_size</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)]</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-16'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-16'>#</a>
            </div>
            <p>නැවතහැඩගස්වන්න <code  class="highlight"><span></span><span class="p">[</span><span class="n">batch_size</span><span class="p">,</span> <span class="n">heads</span><span class="p">,</span> <span class="n">d_k</span><span class="p">,</span> <span class="n">seq_len</span><span class="p">]</span></code>
 </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">103</span>        <span class="n">x</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="n">batch_size</span><span class="p">,</span> <span class="n">heads</span><span class="p">,</span> <span class="n">d_k</span><span class="p">,</span> <span class="n">seq_len</span><span class="p">)</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-17'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-17'>#</a>
            </div>
            <p>කිරීමටඅවසර <code  class="highlight"><span></span><span class="p">[</span><span class="n">seq_len</span><span class="p">,</span> <span class="n">batch_size</span><span class="p">,</span> <span class="n">heads</span><span class="p">,</span> <span class="n">d_k</span><span class="p">]</span></code>
 </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">105</span>        <span class="n">x</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">permute</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-18'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-18'>#</a>
            </div>
            <p> </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">108</span>        <span class="k">return</span> <span class="n">x</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-19'>
        <div class='docs doc-strings'>
            <div class='section-link'>
                <a href='#section-19'>#</a>
            </div>
            <h2>බහු-Dconv-හිසඅවධානය (MDHA)</h2>
<p>අපි <a href="../mha.html#MHA">බහු-ප්රධාන අවධානය</a> අපගේ මුල් ක්රියාත්මක කිරීම දීර් and කරන අතර විමසුම, යතුර සහ වටිනාකම් ප්රක්ෂේපණ සඳහා අවකාශීය ගැඹුර-නැණවත් සම්මුතිය එක් කරමු. </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">111</span><span class="k">class</span> <span class="nc">MultiDConvHeadAttention</span><span class="p">(</span><span class="n">MultiHeadAttention</span><span class="p">):</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-20'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-20'>#</a>
            </div>
            
        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">119</span>    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">heads</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">d_model</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">dropout_prob</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.1</span><span class="p">):</span>
<span class="lineno">120</span>        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">heads</span><span class="p">,</span> <span class="n">d_model</span><span class="p">,</span> <span class="n">dropout_prob</span><span class="p">)</span></pre></div>
        </div>
    </div>
    <div class='section' id='section-21'>
        <div class='docs'>
            <div class='section-link'>
                <a href='#section-21'>#</a>
            </div>
            <p><a href="../mha.html#MHA">බහු-ප්රධාන අවධානය</a> විමසුම නිර්මාණය කරනු ඇත, ප්රධාන සහ අගය ප්රක්ෂේපනය මොඩියුල <code  class="highlight"><span></span><span class="bp">self</span><span class="o">.</span><span class="n">query</span></code>
<code  class="highlight"><span></span><span class="bp">self</span><span class="o">.</span><span class="n">key</span></code>
, සහ <code  class="highlight"><span></span><span class="bp">self</span><span class="o">.</span><span class="n">value</span></code>
. </p>
<p>අපිඔවුන් එක් එක් කිරීමට අවකාශීය ගැඹුර-නැණවත් convolution ස්ථරය ඒකාබද්ධ හා වෙනුවට <code  class="highlight"><span></span><span class="bp">self</span><span class="o">.</span><span class="n">query</span></code>
<code  class="highlight"><span></span><span class="bp">self</span><span class="o">.</span><span class="n">key</span></code>
, සහ <code  class="highlight"><span></span><span class="bp">self</span><span class="o">.</span><span class="n">value</span></code>
. </p>
<p>📝 <em>මෙය සහ වැනිලා ට්රාන්ස්ෆෝමර් බහු-හිස අවධානය අතර වෙනස පැහැදිලිව පෙන්වන බැවින් මෙම පිරිසිදු ක්රියාත්මක කිරීම තේරුම් ගැනීම පහසු යැයි අපට හැඟේ</em>. </p>

        </div>
        <div class='code'>
            <div class="highlight"><pre><span class="lineno">130</span>        <span class="bp">self</span><span class="o">.</span><span class="n">query</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">Sequential</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">query</span><span class="p">,</span> <span class="n">SpatialDepthWiseConvolution</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">d_k</span><span class="p">))</span>
<span class="lineno">131</span>        <span class="bp">self</span><span class="o">.</span><span class="n">key</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">Sequential</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">key</span><span class="p">,</span> <span class="n">SpatialDepthWiseConvolution</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">d_k</span><span class="p">))</span>
<span class="lineno">132</span>        <span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">Sequential</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="p">,</span> <span class="n">SpatialDepthWiseConvolution</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">d_k</span><span class="p">))</span></pre></div>
        </div>
    </div>
    <div class='footer'>
        <a href="https://papers.labml.ai">Trending Research Papers</a>
        <a href="https://labml.ai">labml.ai</a>
    </div>
</div>
<script src=../../interactive.js?v=1"></script>
<script>
    function handleImages() {
        var images = document.querySelectorAll('p>img')

        for (var i = 0; i < images.length; ++i) {
            handleImage(images[i])
        }
    }

    function handleImage(img) {
        img.parentElement.style.textAlign = 'center'

        var modal = document.createElement('div')
        modal.id = 'modal'

        var modalContent = document.createElement('div')
        modal.appendChild(modalContent)

        var modalImage = document.createElement('img')
        modalContent.appendChild(modalImage)

        var span = document.createElement('span')
        span.classList.add('close')
        span.textContent = 'x'
        modal.appendChild(span)

        img.onclick = function () {
            console.log('clicked')
            document.body.appendChild(modal)
            modalImage.src = img.src
        }

        span.onclick = function () {
            document.body.removeChild(modal)
        }
    }

    handleImages()
</script>
</body>
</html>